Using Microsoft SQL Server platform for plagiarism detection
Authors
Abstract
The paper presents an approach to plagiarism detection in a large corpus of documents using the Microsoft SQL Server platform. The approach was used to participate in the first international plagiarism detection competition, held as part of the PAN’09 workshop. Its main advantages are high precision, good performance, and readiness for deployment in a production environment at a relatively low cost for the required third-party software. The approach uses a fingerprinting-based algorithm to compare documents and the Levenshtein metric to mark up plagiarized fragments in the texts.
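The abstract names two techniques without giving details: fingerprinting to compare documents, and the Levenshtein metric to delimit copied fragments. The sketch below illustrates both ideas in Python under stated assumptions; it is not the authors' SQL Server implementation. The function names, the choice of word 5-grams, and the SHA-1 based fingerprint are all hypothetical choices made for illustration.

```python
import hashlib


def fingerprints(text, k=5):
    """Return the set of hashed word k-grams of a text.

    k=5 and the truncated SHA-1 digest are illustrative assumptions,
    not parameters taken from the paper.
    """
    words = text.lower().split()
    grams = (" ".join(words[i:i + k]) for i in range(len(words) - k + 1))
    return {hashlib.sha1(g.encode("utf-8")).hexdigest()[:8] for g in grams}


def overlap(suspicious, source, k=5):
    """Fraction of the suspicious text's fingerprints also found in the source."""
    fa, fb = fingerprints(suspicious, k), fingerprints(source, k)
    return len(fa & fb) / len(fa) if fa else 0.0


def levenshtein(s, t):
    """Edit distance between two strings, computed one row at a time."""
    prev = list(range(len(t) + 1))
    for i, cs in enumerate(s, start=1):
        curr = [i]
        for j, ct in enumerate(t, start=1):
            cost = 0 if cs == ct else 1
            curr.append(min(prev[j] + 1,          # delete a character of s
                            curr[j - 1] + 1,      # insert a character of t
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]
```

In a pipeline of this kind, a cheap fingerprint overlap would typically shortlist candidate document pairs, and the more expensive edit distance would then be applied only to the matching regions to decide where a copied fragment starts and ends.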
Similar resources
Health Ontology Generator: Design And Implementation
This paper presents the design and implementation of a Health Ontology Generator (HOG) using a health database such as Microsoft Access or SQL Server. The development of the ontology generator involves building methods for creating and reading the ontology. This research performs both these tasks. In generating the ontology, database tables are treated as classes, fields as functional propertie...
SQL Server Megaservers: Scalability, Availability, Manageability
Microsoft® SQL Server™ has evolved to support huge databases and applications, including multiterabyte databases used by millions of people. SQL Server achieves this scalability by supporting scale up on symmetric multiprocessor (SMP) systems, allowing users to add processors, memory, disks and networking to build a large single node, as well as scale out on multinode clusters, allowing a huge...
Geospatial Stream Query Processing using Microsoft SQL Server StreamInsight
Microsoft SQL Server spatial libraries contain several components that handle geometrical and geographical data types. With advances in geo-sensing technologies, there has been an increasing demand for geospatial streaming applications. Microsoft SQL Server StreamInsight (StreamInsight, for brevity) is a platform for developing and deploying streaming applications that run continuous queries ov...
Construction of Agricultural Products Logistics Information System Based on .Net and Wap
Functions and construction of agricultural products logistics system based on .NET and WAP technology are introduced in detail. The problems encountered during the process of system development and corresponding solutions are also illustrated. The Windows 2003 Server and SQL Server 2005 serve as the platform and background database server respectively, and windows are designed using the ASP.NET...
A workflow mining approach for deriving software process models
Technical Skills • Programming Languages: Java, Javascript, C++, C, PL/SQL, XML, XSLT, HTML, Groovy, Scala, Pascal, Delphi, Prolog, Lisp, Assembler, Visual Basic, Perl, Shell, etc... • Component Architectures: Java EE (J2EE JEE6), OSGi /Equinox/Felix, Spring, Corba, Quasar • Java Libraries and Frameworks: Eclipse RCP, Eclipse RAP, Swing, Awt, JSF/Facelets, JavaFX, EJB 2.* 3.*, JMS, JAXB, XStrea...